Skip to content

Add LLM fallback chains, multi-LLM init wizard, and OpenAI OAuth#7

Merged
initializ-mk merged 2 commits into
mainfrom
core/llm-fallback
Feb 25, 2026
Merged

Add LLM fallback chains, multi-LLM init wizard, and OpenAI OAuth#7
initializ-mk merged 2 commits into
mainfrom
core/llm-fallback

Conversation

@initializ-mk
Copy link
Copy Markdown
Contributor

Summary

  • LLM Fallback Chains: Adds FallbackChain (implements llm.Client) with cooldown tracking and error classification. When the primary provider fails (rate limit, auth error, server error), requests automatically failover to the next provider in the chain.
  • Runtime Integration: ResolveModelConfig resolves fallback providers from forge.yaml, FORGE_MODEL_FALLBACKS env var, or auto-detection from available API keys. Runner builds a FallbackChain wrapping primary + fallback clients.
  • Multi-LLM Init Wizard: New FallbackStep in the TUI wizard lets users select fallback providers and enter API keys during forge init. Also supports --fallbacks flag for non-interactive mode.
  • OpenAI OAuth: Browser-based login (Authorization Code + PKCE) via auth.openai.com with a local callback server. OAuthClient auto-refreshes expired tokens. ResponsesClient implements llm.Client for the OpenAI Responses API (required by ChatGPT OAuth/Codex endpoint).
  • Model Selection: Provider-specific model choices during init based on auth method (OAuth vs API key), with friendly display names.
  • Bug Fixes: OAuth/API-key routing (prevents Codex endpoint misuse when API key is configured), multi-select UX (auto-check cursor item on Enter).

Test plan

  • cd forge-core && go test ./llm/... ./runtime/... — fallback chain, cooldown, errors, config resolution, OAuth packages
  • cd forge-cli && go build ./... — compilation check
  • Run forge init, select a primary provider, verify fallback step appears, add a fallback, verify forge.yaml includes fallbacks section
  • Run forge init, select OpenAI, choose "Login with OpenAI", verify browser opens and OAuth completes
  • Configure OpenAI with API key, verify it uses Chat Completions (not Codex Responses endpoint)
  • Set both ANTHROPIC_API_KEY and OPENAI_API_KEY, verify auto-detected fallback in startup banner
  • forge init my-agent --model-provider anthropic --fallbacks openai,gemini --non-interactive — verify forge.yaml has fallbacks

- Fallback chain: FallbackChain, CooldownTracker, error classification
  in forge-core/llm/ with full test coverage
- Runtime integration: ResolveModelConfig resolves fallbacks from
  forge.yaml, FORGE_MODEL_FALLBACKS env var, and auto-detection
- Runner builds FallbackChain wrapping primary + fallback clients
- TUI wizard: FallbackStep for selecting fallback providers and
  collecting API keys during forge init
- OpenAI OAuth: browser-based login (Authorization Code + PKCE) via
  auth.openai.com with local callback server
- ResponsesClient: new llm.Client for the OpenAI Responses API used
  by ChatGPT OAuth tokens (Codex endpoint)
- OAuthClient: auto-refreshing wrapper that detects token expiry
- Model selection: provider-specific model choices during init based
  on auth method (OAuth vs API key)
- Fix OAuth/API-key routing: only use stored OAuth credentials when
  no API key is configured, preventing Codex endpoint misuse
- Fix multi-select UX: auto-check cursor item on Enter when nothing
  is checked, so single-selection works intuitively
Update TestBuildTemplateData_DefaultModels to expect gpt-5.2-2025-12-11
instead of the old gpt-4o-mini default.
@initializ-mk initializ-mk merged commit 1533a15 into main Feb 25, 2026
9 checks passed
naveen-kurra pushed a commit to naveen-kurra/forge that referenced this pull request May 23, 2026
…iz#7)

Reviewer asked: is the egress matcher port-agnostic? If yes, the
current TestAuthDomains_PortStripped behavior is correct; if no, an
issuer on :8443 is silently blocked at JWKS-fetch time.

Investigation result: YES, the egress matcher IS port-agnostic. All
three callsites strip the port off the outbound host before passing
to matcher.IsAllowed:

  - egress_enforcer.go:40 — req.URL.Hostname() before IsAllowed
  - egress_proxy.go:210   — extractHost() via net.SplitHostPort
  - safe_dialer.go:42     — net.SplitHostPort(addr)

So auth_domains.go emitting hostname-only entries is correct and
consistent. No code fix needed.

What WAS missing: the cross-package contract was implicit. If someone
later hardens the matcher to be port-aware ("allow :443 but not :8080"),
auth_domains.go would silently break — every forge.yaml that points at
a non-443 IdP would suddenly fail JWKS fetches.

This commit:
  - Adds a doc comment on hostFromURL() naming the three callsites
    whose port-stripping behavior we depend on. If any of them changes,
    auth_domains needs to follow.
  - Adds TestAuthDomains_AssumesPortAgnosticMatcher that asserts the
    contract directly: builds a matcher with the host AuthDomains
    produces, checks the hostname-only form is allowed, AND checks
    that host:port form is NOT (yet) allowed. If a future change makes
    the matcher port-aware, the second assertion flips and the test
    flags the cross-package break.

No production code changes. The fix is preventative documentation +
test, not behavior change.

Verification:
  go test -race ./forge-core/security/ ./forge-core/auth/... — green
  full sweep — all packages pass
  golangci-lint v2.10.1 — 0 issues
naveen-kurra pushed a commit to naveen-kurra/forge that referenced this pull request May 23, 2026
…ializ#12)

Six coverage gaps and one hardening item from the reviewer's eight-point
batch. Two items in the original list are already covered:

  12.2 (--no-auth + YAML auth precedence) is exercised by the four
       TestResolveAuth_NoAuth* tests added in review initializ#4 commit b33522d.
  12.4 (TestAuthDomains in forge-core/security) is exercised by the
       eight TestAuthDomains_* tests in security/auth_domains_test.go
       (added in PR3 + strengthened in review initializ#7 commit cf4ae14).

The six real gaps:

12.1  Warning capture for --auth-url + YAML auth precedence
  TestBuildUserAuthChain_BothSourcesPrefersYAML was asserting the
  chain-behavior half of the contract through a nop logger that
  swallowed the Warn — silent regression on the "we logged a warning"
  half. Switched to recordLogger and added an assertion on
  "--auth-url and forge.yaml". If a refactor ever drops that warning,
  the test fails.

12.3  Full resolveAuth + YAML loopback head-of-chain test
  auth_chain_test.go exercised buildLegacyAuthChain. Nothing exercised
  resolveAuth() end-to-end with a YAML auth: block AND a minted
  internal token, then asserted "loopback is at chain head, ahead of
  YAML provider." Added TestResolveAuth_YAMLAuth_LoopbackPrependedAtChainHead
  that inspects chain.Providers() ordering directly and smoke-tests
  the loopback token verifies with the "internal" source.

12.5  Backend contract test for the frontend's claim_map translation
  app.js's buildAuthPayload translates flat groups_claim → nested
  claim_map.groups before submitting. The backend doesn't run that
  translation; if the frontend ever regresses, downstream YAML would
  silently drop the custom claim. Added two tests in the UI handler:
  - TestHandleCreateAgent_NestedClaimMapShape pins what the backend
    EXPECTS post-translation (claim_map.groups present).
  - TestHandleCreateAgent_FlatGroupsClaim_PreservedAsExtraField
    documents the failure mode if the JS regresses — the request
    succeeds structurally but claim_map is missing.
  Real JS tests would require a Node toolchain we deliberately don't
  add (see PR6 / review initializ#8). These Go tests are the next-best contract.

12.6  Backspace-to-picker behavior in step_auth.go
  Added TestAuthStep_BackspaceOnEmptyInputReturnsToPicker and a
  counterpart TestAuthStep_BackspaceWithContentDoesNotReturnToPicker
  to pin the escape-hatch behavior the review #11e hint advertises.

12.7  Invalid issuer URL blocks advance
  validateHTTPSURL was unit-tested standalone; never exercised as a
  wizard interaction. Added TestAuthStep_InvalidIssuerURLBlocksAdvance:
  bad URL + Enter → still on issuer phase; type a valid replacement
  + Enter → advances to audience.

12.8  renderAuthBlock YAML quoting hardening
  writeYAMLMap previously emitted "key: value" raw. Values containing
  ": " sequences, "#" markers, leading YAML-significant chars
  (!, *, -, [, {, etc.), reserved tokens (true/false/null/yes/no…),
  or control chars could break the parser. Added:
  - yamlScalar/needsYAMLQuoting helpers that emit double-quoted +
    escaped strings only when the input would otherwise change
    meaning. Plain values pass through unchanged so generated
    forge.yaml remains readable.
  - TestRenderAuthBlock_QuotesUnsafeValues (19 sub-cases) pinning
    the predicate's classification per input shape.
  - TestRenderAuthBlock_QuotedValueIsParseable round-trips a value
    containing ": ", "#", "!", and quotes through the renderer →
    yaml.v3 unmarshaler, asserting the rendered line is valid YAML
    AND decodes back to the original string.

Note: the renderer remains a focused subset (string / number / bool /
map). If auth-settings ever grow richer types, switch to yaml.v3 for
the whole block.

Verification:
  test -z "$(gofmt -l forge-core forge-cli forge-plugins)" — clean
  go test -race ./forge-core/... ./forge-cli/... — all green
  golangci-lint v2.10.1 — 0 issues
naveen-kurra pushed a commit to naveen-kurra/forge that referenced this pull request May 24, 2026
Batch-clearing the "don't block merge" follow-ups from review of PR initializ#79.

initializ#1 gcp_iap classifyJWTErr — use jwt v5 sentinels via errors.Is rather
   than substring matching (library wording shifts across patches;
   sentinels are public API). Special-case ErrTokenSignatureInvalid
   to split alg-confusion (→ ErrInvalidToken) from real bad-signature
   (→ ErrTokenRejected) because golang-jwt wraps both under that one
   sentinel. Three internal keyFunc message-matches retained — those
   are strings WE control, not the library's.

initializ#2 gcp_iap JWKS merge-on-success — switched j.keys = newKeys to a
   per-kid merge. A partial-but-valid JWKS response (e.g. one kid
   accidentally omitted by GCP during rotation) no longer drops kids
   the stale-grace contract assumes we still have. Worst case is
   keeping a retired kid in cache; verification still fails naturally
   for any token signed with the retired private key.

initializ#3 azure_ad GraphCache defensive copies — Get returns
   append([]string(nil), e.groups...) and Put stores a copy of its
   input. Caller mutating Identity.Groups (the auth.Identity layer
   treats it as freely mutable) can't poison the cache.

initializ#4 forge-cli needsYAMLQuoting numeric edge cases — quote anything
   that resembles a YAML number (hex 0x, octal 0o, binary 0b,
   leading-zero "010", scientific 1e10, decimal float 3.14, signed
   ±N, .inf / .nan in either case). Auth-setting values rarely hit
   these shapes but the docstring promised "false negatives are
   bugs" and the Web UI POST path can supply arbitrary strings.
   Added looksNumeric() helper with separate allHexDigits /
   allOctalDigits / allBinaryDigits gates.

initializ#5 aws_sigv4 identity_cache_test — replaced string(rune(i)) with
   strconv.Itoa(i). Surrogate code points (0xD800..0xDFFF) all map
   to U+FFFD, so the eviction-threshold test was silently building
   ~10 distinct keys instead of 10_001 and the sweep never ran.

initializ#6 http.NewRequestWithContext error handling — fixed the two
   `req, _ := ...` antipatterns in gcp_iap/iap_jwks.go and
   azure_ad/graph_client.go. Hardcoded URLs make the failure currently
   unreachable, but errcheck-clean is the discipline.

initializ#7 gcp_iap HS256-with-EC-public-key alg-confusion test — pinned the
   most dangerous attack shape: attacker fetches the verifier's public
   key from JWKS (open by design), uses raw X/Y bytes as the HMAC
   "secret", signs an HS256 token. A non-whitelisting verifier would
   HMAC-verify it. Our keyFunc rejects on alg != "ES256" BEFORE key
   lookup; this test confirms.

Tests added: TestGraphCache_GetReturnsDefensiveCopy,
TestGraphCache_PutStoresDefensiveCopy,
TestProvider_HS256WithECPublicKeyAsSecret_Rejected. Existing
TestProvider_RS256Token_Rejected still passes (alg-confusion still
classified as ErrInvalidToken under the new sentinel-based path).

42 packages green, lint + gofmt clean.
initializ-mk pushed a commit that referenced this pull request May 24, 2026
Batch-clearing the "don't block merge" follow-ups from review of PR #79.

#1 gcp_iap classifyJWTErr — use jwt v5 sentinels via errors.Is rather
   than substring matching (library wording shifts across patches;
   sentinels are public API). Special-case ErrTokenSignatureInvalid
   to split alg-confusion (→ ErrInvalidToken) from real bad-signature
   (→ ErrTokenRejected) because golang-jwt wraps both under that one
   sentinel. Three internal keyFunc message-matches retained — those
   are strings WE control, not the library's.

#2 gcp_iap JWKS merge-on-success — switched j.keys = newKeys to a
   per-kid merge. A partial-but-valid JWKS response (e.g. one kid
   accidentally omitted by GCP during rotation) no longer drops kids
   the stale-grace contract assumes we still have. Worst case is
   keeping a retired kid in cache; verification still fails naturally
   for any token signed with the retired private key.

#3 azure_ad GraphCache defensive copies — Get returns
   append([]string(nil), e.groups...) and Put stores a copy of its
   input. Caller mutating Identity.Groups (the auth.Identity layer
   treats it as freely mutable) can't poison the cache.

#4 forge-cli needsYAMLQuoting numeric edge cases — quote anything
   that resembles a YAML number (hex 0x, octal 0o, binary 0b,
   leading-zero "010", scientific 1e10, decimal float 3.14, signed
   ±N, .inf / .nan in either case). Auth-setting values rarely hit
   these shapes but the docstring promised "false negatives are
   bugs" and the Web UI POST path can supply arbitrary strings.
   Added looksNumeric() helper with separate allHexDigits /
   allOctalDigits / allBinaryDigits gates.

#5 aws_sigv4 identity_cache_test — replaced string(rune(i)) with
   strconv.Itoa(i). Surrogate code points (0xD800..0xDFFF) all map
   to U+FFFD, so the eviction-threshold test was silently building
   ~10 distinct keys instead of 10_001 and the sweep never ran.

#6 http.NewRequestWithContext error handling — fixed the two
   `req, _ := ...` antipatterns in gcp_iap/iap_jwks.go and
   azure_ad/graph_client.go. Hardcoded URLs make the failure currently
   unreachable, but errcheck-clean is the discipline.

#7 gcp_iap HS256-with-EC-public-key alg-confusion test — pinned the
   most dangerous attack shape: attacker fetches the verifier's public
   key from JWKS (open by design), uses raw X/Y bytes as the HMAC
   "secret", signs an HS256 token. A non-whitelisting verifier would
   HMAC-verify it. Our keyFunc rejects on alg != "ES256" BEFORE key
   lookup; this test confirms.

Tests added: TestGraphCache_GetReturnsDefensiveCopy,
TestGraphCache_PutStoresDefensiveCopy,
TestProvider_HS256WithECPublicKeyAsSecret_Rejected. Existing
TestProvider_RS256Token_Rejected still passes (alg-confusion still
classified as ErrInvalidToken under the new sentinel-based path).

42 packages green, lint + gofmt clean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant